4.5: Non

您所在的位置：网站首页 › what is context free grammar › 4.5: Non

4.5: Non

#4.5: Non| 来源: 网络整理| 查看: 265

We have seen that there are context-free languages that are not regular. The natural question arises, are there languages that are not context-free? It’s easy to answer this question in the abstract: For a given alphabet Σ, there are uncountably many languages over Σ, but there are only countably many context-free languages over Σ. It follows that most languages are not context-free. However, this answer is not very satisfying since it doesn’t give us any example of a specific language that is not context-free.

As in the case of regular languages, one way to show that a given lan- guage L is not context-free is to find some property that is shared by all context-free languages and then to show that L does not have that prop- erty. For regular languages, the Pumping Lemma gave us such a property. It turns out that there is a similar Pumping Lemma for context-free lan- guages. The proof of this lemma uses parse trees. In the proof, we will need a way of representing abstract parse trees, without showing all the details of the tree. The picture

屏幕快照 2019-06-19 13.32.42.png

represents a parse tree which has the non-terminal symbol A at its root and the string x along the “bottom” of the tree. (That is, x is the string made up of all the symbols at the endpoints of the tree’s branches, read from left to right.) Note that this could be a partial parse tree—something that could be a part of a larger tree. That is, we do not require A to be the start symbol of the grammar and we allow x to contain both terminal and non-terminal symbols. The string x, which is along the bottom of the tree, is referred to as the yield of the parse tree. Sometimes, we need to show more explicit detail in the tree. For example, the picture

屏幕快照 2019-06-19 13.36.35.png

represents a parse tree in which the yield is the string xyz. The string y is the yield of a smaller tree, with root B, which is contained within the larger tree. Note that any of the strings x, y, or z could be the empty string.

We will also need the concept of the height of a parse tree. The height of a parse tree is the length of the longest path from the root of the tree to the tip of one of its branches.

Like the version for regular languages, the Pumping Lemma for context- free languages shows that any sufficiently long string in a context-free language contains a pattern that can be repeated to produce new strings that are also in the language. However, the pattern in this case is more complicated. For regular languages, the pattern arises because any sufficiently long path through a given DFA must contain a loop. For context-free languages, the pattern arises because in a sufficiently large parse tree, along a path from the root of the tree to the tip of one of its branches, there must be some non-terminal symbol that occurs more than once.

Theorem 4.7

(Pumping Lemma for Context-free Languages). Suppose that L is a context-free language. Then there is an integer K such that any string \(w \in L(G)\) with \(|w| \geq K\) has the property that w can be written in the form w = uxyzv where

x and z are not both equal to the empty string; |xyz| < K; and For any \(n \in \mathbb{N},\) the string \(u x^{n} y z^{n} v\) is in \(L\)

Proof. Let \(G=(V, \Sigma, P, S)\) be a context-free grammar for the languageL. Let N be the number of non-terminal symbols in G, plus 1. That is, \(N=|V|+1\). Consider all possible parse trees for the grammar G with height less than or equal to N. (Include parse trees with any non-terminal symbol as root, not just parse trees with root S.) There are only finitely many such parse trees, and therefore there are only finitely many different strings that are the yields of such parse trees. Let K be an integer which is greater than the length of any such string.

Now suppose that w is any string in L whose length is greater than or equal to K. Then any parse tree for w must have height greater than N. (This follows since \(|w| \geq K\) and the yield of any parse tree of height \(\leq N\) has length less than \(K\). Consider a parse tree for w of minimal size, that is one that contains the smallest possible number of nodes. Since the height of this parse tree is greater than N, there is at least one path from the root of the tree to tip of a branch of the tree that has length greater than N. Consider the longest such path. The symbol at the tip of this path is a terminal symbol, but all the other symbols on the path are non- terminal symbols. There are at least N such non-terminal symbols on the path. Since the number of different non-terminal symbols is |V | and since \(N=|V|+1\), some non-terminal symbol must occur twice on the path. In fact, some non-terminal symbol must occur twice among the bottommostN non-terminal symbols on the path. Call this symbol A. Then we see that the parse tree for w has the form shown here:

屏幕快照 2019-06-19 13.42.43.png

The structure of this tree breaks the string w into five substrings, as shown in the above diagram. We then have w = uxyzv. It only remains to show that x, y, and z satisfy the three requirements stated in the theorem.

Let T refer to the entire parse tree, let \(T_{1}\) refer to the parse tree whose root is the upper A in the diagram, and let \(T_{2}\) be the parse tree whose root is the lower A in the diagram. Note that the height of is \(T_{1}\) less than or equal to N. (This follows from two facts: The path shown in T1 from its root to its base has length less than or equal to N, because we chose the two occurrences of A to be among the N bottommost non-terminal symbols along the path in T from its root to its base. We know that there is no longer path from the root of T1 to its base, since we chose the path in Tto be the longest possible path from the root of T to its base.) Since any parse tree with height less than or equal to N has yield of length less than \(K,\) we see that \(|x y z|

【本文地址】

4.5: Non

4.5: Non

今日新闻

推荐新闻